33 research outputs found

    Intelligent Cooperative Control Architecture: A Framework for Performance Improvement Using Safe Learning

    Planning for multi-agent systems, such as task assignment for teams of fuel-limited unmanned aerial vehicles (UAVs), is challenging due to uncertainties in the assumed models and the very large size of the planning space. Researchers have developed fast cooperative planners based on simple models (e.g., linear, deterministic dynamics), yet inaccuracies in the assumed models degrade the resulting performance. Learning techniques can adapt the model and asymptotically provide better policies than cooperative planners, yet their exploratory nature often violates the safety conditions of the system, and they frequently require an impractically large number of interactions to perform well. This paper introduces the intelligent Cooperative Control Architecture (iCCA) as a framework for combining cooperative planners with reinforcement learning techniques. iCCA improves the policy of the cooperative planner while reducing the risk and sample complexity of the learner. Empirical results in gridworld and fuel-limited UAV task-assignment domains with problem sizes of up to 9 billion state-action pairs verify the advantage of iCCA over pure learning and pure planning strategies.
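
    The gist of such an architecture lends itself to a compact sketch. The following is a minimal illustration, not the authors' iCCA implementation: a Q-learning learner proposes actions, and a risk filter falls back to the cooperative planner's action whenever the proposal looks unsafe. The names (plan_action, is_safe) and the toy fuel rule are assumptions for illustration.

    import random
    from collections import defaultdict

    ACTIONS = ["up", "down", "left", "right"]

    def plan_action(state):
        # Stand-in for a fast cooperative planner built on a simple model.
        return "right"

    def is_safe(state, action, fuel):
        # Stand-in risk model: forbid moves that could strand a low-fuel UAV;
        # here "right" is assumed to head back toward base.
        return fuel > 1 or action == "right"

    q_table = defaultdict(float)            # Q(s, a) estimates
    EPSILON, ALPHA, GAMMA = 0.1, 0.5, 0.95

    def icca_action(state, fuel):
        # Learner proposes (epsilon-greedy over Q); planner is the safe fallback.
        if random.random() < EPSILON:
            proposal = random.choice(ACTIONS)
        else:
            proposal = max(ACTIONS, key=lambda a: q_table[(state, a)])
        return proposal if is_safe(state, proposal, fuel) else plan_action(state)

    def update(state, action, reward, next_state):
        # Standard one-step Q-learning update on whatever action was executed,
        # so the learner still improves even when the planner overrides it.
        best_next = max(q_table[(next_state, a)] for a in ACTIONS)
        q_table[(state, action)] += ALPHA * (
            reward + GAMMA * best_next - q_table[(state, action)])

    The key design point the sketch tries to capture is that the planner both bounds the risk of exploration and seeds a reasonable policy, while the learner's updates still use the executed action.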

    Value Iteration for Simple Stochastic Games: Stopping Criterion and Learning Algorithm

    Simple stochastic games can be solved by value iteration (VI), which yields a sequence of under-approximations of the value of the game. This sequence is guaranteed to converge to the value only in the limit. Since no stopping criterion has been known, this technique provides no guarantees on its results. We provide the first stopping criterion for VI on simple stochastic games. It is achieved by additionally computing a convergent sequence of over-approximations of the value, relying on an analysis of the game graph. Consequently, VI becomes an anytime algorithm returning the current approximation of the value together with an error bound. As another consequence, we obtain a simulation-based asynchronous VI algorithm that yields the same guarantees without necessarily exploring the whole game graph.
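
    The stopping criterion can be illustrated with a small sketch of bounded value iteration, assuming a game whose only end components are the goal and sink states (the paper's graph analysis, which makes the upper bounds converge in general games, is omitted). The game encoding below is an illustrative assumption.

    EPS = 1e-6

    # Each non-terminal state maps to (player, actions); each action is a list
    # of (probability, successor) pairs. MAX states pick the best action, MIN
    # states the worst. "goal" has value 1, "sink" has value 0.
    game = {
        "s0": ("MAX", [[(0.5, "goal"), (0.5, "s1")], [(1.0, "sink")]]),
        "s1": ("MIN", [[(1.0, "s0")], [(0.4, "goal"), (0.6, "sink")]]),
    }

    low = {s: 0.0 for s in game} | {"goal": 1.0, "sink": 0.0}
    up  = {s: 1.0 for s in game} | {"goal": 1.0, "sink": 0.0}

    def bellman(values, state):
        player, actions = game[state]
        outcomes = [sum(p * values[t] for p, t in act) for act in actions]
        return max(outcomes) if player == "MAX" else min(outcomes)

    # Iterate converging under- and over-approximations; the gap between them
    # is a sound error bound, so the loop can stop (anytime) once it is small.
    while max(up[s] - low[s] for s in game) > EPS:
        low = {s: bellman(low, s) for s in game} | {"goal": 1.0, "sink": 0.0}
        up  = {s: bellman(up, s)  for s in game} | {"goal": 1.0, "sink": 0.0}

    print({s: (low[s], up[s]) for s in game})   # both bounds reach 0.7 and 0.4

    Plain VI corresponds to iterating only low; computing up alongside it is what turns the procedure into an anytime algorithm with a certified error bound.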

    Special Agents Can Promote Cooperation in the Population

    Cooperation is ubiquitous in real life, yet every individual would like to maximize her own profits. How does cooperation arise in a group of self-interested agents without centralized control? In hostile scenarios, moreover, cooperation is unlikely to emerge at all. Is there any mechanism to promote cooperation when the population is given and the play rules are not allowed to change? In this paper, numerical experiments show that complete population interaction hinders cooperation in the finite but end-unknown Repeated Prisoner's Dilemma (RPD). We then propose a mechanism called soft control to promote cooperation. Following the basic idea of soft control, a number of special agents are introduced to intervene in the evolution of cooperation. They comply with the play rules of the original group, so they are always treated as normal agents; for our purpose, however, these special agents have their own strategies and share knowledge. We study the capability of the mechanism under different settings and find that soft control can promote cooperation and is robust to noise. Simulation results also demonstrate the applicability of the mechanism in other scenarios, and an analytical proof illustrates the effectiveness of soft control and validates the simulation results. As a form of intervention in collective behaviors, soft control suggests a possible direction for the study of reciprocal behaviors.
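
    A toy simulation conveys the flavor of the setup. The sketch below is an assumption-laden simplification rather than the paper's protocol: it pits a reciprocating strategy, one plausible choice for a special agent, against a defector in the repeated game; the shared-knowledge component of soft control is omitted.

    T, R, P, S = 5, 3, 1, 0          # standard PD payoffs: T > R > P > S
    PAYOFF = {("C", "C"): (R, R), ("C", "D"): (S, T),
              ("D", "C"): (T, S), ("D", "D"): (P, P)}

    def repeated_game(strat_a, strat_b, rounds):
        hist_a, hist_b, score_a, score_b = [], [], 0, 0
        for _ in range(rounds):
            ma, mb = strat_a(hist_b), strat_b(hist_a)  # each sees the other's past
            hist_a.append(ma); hist_b.append(mb)
            pa, pb = PAYOFF[(ma, mb)]
            score_a += pa; score_b += pb
        return score_a, score_b

    def always_defect(opp_hist):     # a hostile member of the population
        return "D"

    def tit_for_tat(opp_hist):       # one plausible special-agent strategy
        return opp_hist[-1] if opp_hist else "C"

    # Special agents cooperate with each other and punish defectors, shifting
    # which strategies earn the most once enough of them are in the population.
    print(repeated_game(tit_for_tat, tit_for_tat, 50))    # (150, 150): mutual C
    print(repeated_game(tit_for_tat, always_defect, 50))  # (49, 54): D barely gains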

    A theory to devise dependable cooperative encounters

    In this paper, we investigate how to characterize "fault tolerance" in cooperative agents. It is generally accepted that cooperating agents can achieve tasks that they could not achieve without cooperation. Nevertheless, cooperating agents can have "Achilles' heels": a cooperative encounter can fail to achieve its tasks because of the collapse of a single agent. The contribution of this paper is a study of how cooperating agents are affected by dependability issues. Specifically, our objectives are twofold: to formally define the concepts of dependability in cooperative encounters, and to analyze the computational complexity of devising dependable cooperative encounters.
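
    One way to make the notion concrete is a brute-force robustness check over a toy capability model. This is a hedged illustration of a plausible dependability definition, not the paper's formalism: an encounter is taken to be k-robust if the team can still achieve every task after any k agents collapse.

    from itertools import combinations

    agents = {"a1": {"sense"}, "a2": {"lift"}, "a3": {"sense", "lift"}}
    tasks  = [{"sense"}, {"lift"}]      # each task needs a set of capabilities

    def achievable(team):
        pooled = set().union(*(agents[a] for a in team)) if team else set()
        return all(need <= pooled for need in tasks)

    def k_robust(k):
        # Brute force over all failure patterns: exponential in general, which
        # echoes why devising dependable encounters is computationally hard.
        return all(achievable(set(agents) - set(down))
                   for down in combinations(agents, k))

    print(k_robust(1))   # True: any single agent can fail
    print(k_robust(2))   # False: losing a1 and a3 removes "sense"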

    Behavior Matching by Observation for Multi-Robot Cooperation

    Based on a formal analysis of qualitative robot behaviors, we present a criterion for classifying multi-robot tasks as either static or dynamic; it suggests that recognizing common resources is essential in dynamic cooperative tasks. In the presented framework, Cooperation by Observation, a robot finds a common resource and chooses an appropriate helpful behavior by observing other agents' behaviors. Demonstrative experiments with real mobile robots are presented: the robots have actively controlled binocular vision and are controlled by an extended behavior-based architecture. The vital part of the architecture consists of attentional buffers, which handle behavior coordination by temporarily remembering the common resource and initializing new behaviors from it.
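
    The attentional-buffer idea admits a small sketch. The following is an illustrative guess at the mechanism, not the paper's architecture: the buffer briefly remembers a common resource spotted by observing another robot, and a new helpful behavior is initialized from it. The class, field names, and decay policy are assumptions.

    import time

    class AttentionalBuffer:
        def __init__(self, ttl=5.0):
            self.ttl = ttl                  # seconds before the memory fades
            self.resource, self.stamp = None, 0.0

        def observe(self, resource):
            # Called when watching another robot reveals a shared resource.
            self.resource, self.stamp = resource, time.monotonic()

        def recall(self):
            # The memory is only valid for a short while; then attention fades
            # and the robot falls back to its default behaviors.
            if self.resource and time.monotonic() - self.stamp < self.ttl:
                return self.resource
            return None

    buf = AttentionalBuffer()
    buf.observe({"kind": "box", "pos": (1.2, 0.4)})
    target = buf.recall()
    if target:                              # initialize a helpful behavior
        print("approach and push:", target["pos"])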

    6D Localization and Kicking for Humanoid Robotic Soccer
